Abstract

Heavy-quark jets are important in many of today's collider studies and searches, 
yet predictions for them are subject to much larger uncertainties than for light 
jets. This is because of strong enhancements in higher orders from large logarithms,
$\ln(p_t/m_Q)$. We propose a new definition of heavy-quark jets, which is free of final-state
logarithms to all orders and such that all initial-state collinear logarithms can be
resummed into the heavy-quark parton distributions. Heavy-jet spectra can then be 
calculated in the massless approximation, which is simpler than a massive calculation 
and reduces the theoretical uncertainties by a factor of three. This provides the first 
ever accurate predictions for inclusive b- and c-jets, and the latter have significant 
discriminatory power for the intrinsic charm content of the proton. The techniques 
introduced here could be used to obtain heavy-flavour jet results from existing mass- 
less next-to-leading order calculations for a wide range of processes. We also discuss 
the experimental applicability of our flavoured jet definition. 



1 Introduction 

Studies of heavy-quark jets, i.e. charm and bottom jets, are important for a range of reasons.
They are of intrinsic interest because charm and bottom are the flavours for which there
exists the most direct correspondence between parton-level production and the observed
hadron level. They have the potential to provide information on the c- and b-quark parton
distribution functions (PDFs), which are the only components of proton structure that are
thought to be generated entirely perturbatively from the DGLAP evolution of the other
flavours. Furthermore, b-jets enter in many collider searches, notably because they are
produced in the decays of various heavy particles, e.g. top quarks, the Higgs boson (if
light) and numerous particles appearing in proposed extensions of the Standard Model
(SM) [1].

Figure 1: Ratio of the measured inclusive b-jet spectrum to the NLO prediction. The measurement
is performed for jets with transverse momentum 38 GeV < $p_T^{\rm jet}$ < 400 GeV and
rapidity $|y^{\rm jet}|$ < 0.7. The plot is taken from ref. [3].

Within the SM a range of production channels exist for heavy-quark jets, e.g. pure QCD 
production or in association with heavy bosons (W, Z, H, ...), see e.g. [2]. The simplest and
most fundamental measurement of heavy-quark jet production is the inclusive heavy-quark 
jet spectrum, which is dominated by pure QCD contributions. Predictions for this sort of 
quantity have always been obtained using calculations in which the c or b quark has been 
explicitly taken to be massive while all other lighter masses are neglected. 

An example is the inclusive b-jet spectrum measured by CDF [3]. Fig. [1] shows the
ratio of the experimental measurement to the next-to-leading order (NLO) calculation
of [4]. A striking feature of this plot is the size of the theoretical (scale variation)
uncertainties (~ 50%). One notes in particular that there is a significant region where the
experimental uncertainties are smaller than the theoretical ones. Furthermore, the b-jet
theory uncertainties are considerably larger than the corresponding ones for the normal
(light) jet inclusive spectrum (~ 10-20%), see for example [5].

The origin of the large theoretical uncertainties in fig. [1] can be understood by examining
fig. [2]. Its top panels show the K-factor (NLO/LO) as obtained with MCFM for the
Tevatron Run II ($p\bar p$, $\sqrt{s}$ = 1.96 TeV, left) and for the LHC ($pp$, $\sqrt{s}$ = 14 TeV, right).
The fact that the K-factor is considerably larger than one indicates that the perturbative
series is very poorly convergent, and implies that the NLO result cannot be an accurate
approximation to the full result. It is for this reason that the scale dependence (middle
panels) is large. One might think that a calculation with MC@NLO [12] should do better,
since it includes both NLO and all-order resummed logarithmically enhanced terms. This
turns out not to be the case, as can be seen from its persistently large scale dependence.²
Essentially, while MC@NLO contains a good matching between the NLO b-production
calculation and the b-quark fragmentation logarithms in Herwig, it does not match with
the logarithmic enhancements contained in Herwig for b-quark production, but rather just
replaces them with the NLO result.

Figure 2: Top: K-factor for the inclusive b-jet spectrum as computed with MCFM [10], clustering
particles into jets using the $k_t$ jet-algorithm [9] with R = 0.7, and selecting jets in the
central rapidity region (|y| < 0.7). Middle: scale dependence obtained by simultaneously
varying the renormalisation and factorisation scales by a factor of two around $p_t$, the
transverse momentum of the hardest jet in the event. Bottom: breakdown of the Herwig [11]
inclusive b-jet spectrum into the three major hard underlying channel cross sections (for
simplicity the small $b\bar b \to b\bar b$ channel is not shown).

1 Fig. [1] has been obtained using a midpoint-type [6] cone algorithm; however, given the recent
discoveries [7, 8] of infrared safety issues in midpoint cone algorithms, we prefer to illustrate our
arguments with an inclusive $k_t$-algorithm [9]. In practice, we expect most features of the figure to
be insensitive to the choice of algorithm, for example also with an infrared-safe cone-type
algorithm such as SISCone [5].

2 Poor numerical convergence prevented us from presenting the scale dependence for MC@NLO at the
LHC. Note also that no K-factor has been shown for MC@NLO because the LO result is not unambiguously
defined.

The poor convergence of the perturbative series is related to the different channels for
heavy-quark production. At leading order (LO), only the so-called flavour creation channel
(FCR) is present, $\ell\ell \to b\bar b$, where $\ell$ is a generic light parton (quark or gluon). At NLO,
two new channels open up, often referred to as the flavour excitation (FEX) and gluon
splitting (GSP) channels.³ In the former, a gluon from one of the incoming hadrons splits
collinearly into a $b\bar b$-pair and one of those b-quarks enters the hard $b\ell \to b\ell$ scattering. In
the gluon splitting process, the hard scattering process is of the form $\ell\ell \to \ell\ell$, and one of
the final-state light partons (at NLO always a gluon) splits collinearly into a $b\bar b$-pair (a jet
containing both $b$ and $\bar b$ is considered to be a b-jet in standard definitions). The various
channels can be conveniently separated with a parton shower Monte Carlo generator such
as Herwig [11], where one can determine the underlying hard channel from the hard process
in the Herwig event record.⁴ Their relative contributions to the total b-jet spectrum are
shown in the bottom panel of fig. [2]. One sees that the supposedly LO channel (FCR) is
nearly always smaller than the two channels that at fixed order enter only at NLO (FEX
and GSP). This is because both NLO channels receive a strong enhancement from collinear
logarithms, going as $\alpha_s^2 (\alpha_s \ln(p_t/m_b))^n$ for flavour excitation [13] and
$\alpha_s^2 \cdot \alpha_s^n \ln^{2n-1}(p_t/m_b)$ for gluon splitting ($n \ge 1$) [14].
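
To get a feeling for the size of these enhancements, the following minimal sketch evaluates the two towers of logarithms numerically. The fixed coupling (0.12) and the b-quark mass (4.75 GeV) are illustrative assumptions rather than values taken from the text, the running of the coupling is ignored, and the common $\alpha_s^2$ prefactor is dropped.

```python
import math

ALPHA_S = 0.12  # assumed fixed coupling (illustrative, no running)
M_B = 4.75      # assumed b-quark mass in GeV (illustrative)


def log_enhancements(pt, n, alpha_s=ALPHA_S, m_b=M_B):
    """Return the order-n enhancement factors (FEX, GSP) relative to alpha_s^2."""
    big_log = math.log(pt / m_b)
    fex = (alpha_s * big_log) ** n               # (alpha_s ln(pt/mb))^n
    gsp = alpha_s ** n * big_log ** (2 * n - 1)  # alpha_s^n ln^(2n-1)(pt/mb)
    return fex, gsp


if __name__ == "__main__":
    for pt in (50.0, 100.0, 400.0):
        for n in (1, 2):
            fex, gsp = log_enhancements(pt, n)
            print(f"pt = {pt:5.0f} GeV, n = {n}: FEX ~ {fex:.3f}, GSP ~ {gsp:.3f}")
```

With these assumed inputs, $\alpha_s \ln(p_t/m_b)$ is about 0.37 at $p_t$ = 100 GeV, so successive terms in either tower are not strongly suppressed, and the gluon-splitting tower grows faster with $n$ because of its extra powers of the logarithm.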

Three approaches come to mind for increasing the accuracy of the b-jet spectrum
prediction. The most obvious (and hardest) is to carry out the full massive next-to-next-to-leading
order calculation. Aside from being beyond the limit of today's technology, such
an approach would still leave many of the higher order logarithms uncalculated and so
would only partially improve the situation. A second approach would be to carry out the
explicit resummation of both the incoming and outgoing collinear logarithms. The technology
for each resummation on its own is well-known at next-to-leading logarithmic accuracy
(NLLA) [13, 14], though significant effort would probably be necessary to assemble them
together effectively. In both of the above approaches, the largest residual uncertainties are
likely to be associated with the channel with the most logarithms, gluon splitting. This
channel however does not even correspond to one's physical idea of a b-jet, i.e. one induced
by a hard b-quark, and it seems somehow unnatural to include it at all as part of one's b-jet
spectrum.

3 It is sometimes stated that it makes no sense, beyond LO, to separately discuss the different channels,
for example because diagrams for separate channels interfere. However, each channel is associated with
a different structure of logarithmic enhancements, $\ln^n(p_t/m_b)$, and so there is distinct physical meaning
associated with each channel. Furthermore one can give a precise, measurable definition to each channel,
e.g. using an exclusive variant of the flavour jet algorithm discussed below. Though there will be some
arbitrariness in any such definition, relating to the choice of parameters of the jet algorithm, this
arbitrariness is no more troubling conceptually or practically than the jet-definition dependence that
arises when determining the number of jets in an event.

4 The use of Herwig to label the flavour channel does not correspond to a directly measurable definition;
however, Herwig does have the correct logarithmic enhancements for each channel, and furthermore gives
results rather similar to those based on a flavour-channel classification using the algorithm of section [2]
with R = 1 (the value of R that places initial and final-state radiation on the same footing for $k_t$-type jet
algorithms).

We therefore propose a third approach to improving the accuracy of the prediction of
the b-jet spectrum. It is a two-pronged approach. Firstly, one uses a definition of b-jets
which maintains the correspondence between partonic flavour and jet flavour. Specifically,
we take the flavour-$k_t$ algorithm of [15]. Within this algorithm, described in section [2], a
jet containing equal numbers of b quarks and b antiquarks is considered to be a light jet, so
that jets that contain a $b$ and $\bar b$ from the gluon splitting channel do not contribute to the
b-jet spectrum. The use of this kind of algorithm already leads to some reduction of the
theoretical uncertainty on the b-jet spectrum with a standard massive calculation (e.g. with
MCFM). Further improvement can be obtained by exploiting the fact that the logarithms
of $p_t/m_b$ that remain are those associated with flavour excitation, which coincide with
those resummed in the b-quark parton distribution function (PDF) at scale $p_t$. If one uses
a b-quark PDF to resum these logarithms, no other logarithms $\ln(p_t/m_b)$ appear in the
rest of the calculation, so that one can safely take the limit $m_b \to 0$ and one misses only
corrections suppressed by powers of $(m_b/p_t)^2$ (possibly with additional logarithms). The
validity of this procedure is a consequence of the infrared safety of the jet flavour even in the
massless limit (see later).⁵ This third approach is therefore the one which is technically the
easiest to pursue and which should simultaneously reduce the theoretical uncertainties the
most. In section [3] we present results for c- and b-jets using this method. A similar approach
can also be used in different contexts, e.g. recently the flavour algorithm of [15] has been
used to define the $e^+e^-$ forward-backward asymmetry for $b$ in an infrared-safe way, making
it possible to compute this quantity at NNLO using a massless QCD calculation [17].

Several issues deserve detailed discussion in the above approach. Firstly, for moderate
values of $p_t$ (or of the jet energy in $e^+e^-$), finite-mass effects may not be completely
negligible. It is therefore important to determine their size. We explain briefly how this
can be done in section [3], with further details given in appendix [A]. A second issue is an
experimental one related to the limited efficiency for the identification of B and D hadrons.
Though not strictly within the remit of a theoretical paper, we do find it useful to discuss
various points related to this issue in section [4]. Finally we also comment on the question
of electroweak effects, in appendix [B].

5 We note that such a '5-flavour scheme', with resummed b-quark PDFs, has been used before in MCFM
for H + b and Z + b production [15]. In that case, because a non-flavour jet algorithm was used, it was
necessary to supplement the results with an explicit massive calculation of the NLO gluon-splitting process.
We thank John Campbell for bringing this to our attention.

2 The heavy-quark jet algorithm

In general, flavour algorithms provide an IR-safe definition of the flavour of a jet, provided
one knows the (light or heavy) flavour of each parton involved. However, to study heavy-quark
jets it is not necessary to know the flavour of light quarks, because gluons and
light-flavoured quarks can be considered as flavourless, while one assigns to heavy (anti)-quarks
a flavour 1 (-1). We define the heavy-flavour of a (pseudo)-particle or a jet as its net
heavy-flavour content, i.e. the total number of heavy quarks minus heavy antiquarks. One may
alternatively use the sum of the number of quarks and anti-quarks modulo 2. Flavourless
(flavoured) objects are then those with (non-)zero net flavour. We present here the inclusive
version of the heavy-flavour jet algorithm for hadron-hadron collisions, referring the reader
to [15] for the motivation of the formalism (as well as the original exclusive formulation):

1. For any pair of final-state particles i, j define a class of longitudinal boost invariant
distances $d_{ij}^{(F,\alpha)}$ parametrised by $0 < \alpha \le 2$ and a jet radius R

$$ d_{ij}^{(F,\alpha)} = \frac{\Delta y_{ij}^2 + \Delta\phi_{ij}^2}{R^2} \times
\begin{cases}
\max(k_{ti}, k_{tj})^{\alpha} \min(k_{ti}, k_{tj})^{2-\alpha}, & \text{softer of } i, j \text{ is flavoured}, \\
\min(k_{ti}^2, k_{tj}^2), & \text{softer of } i, j \text{ is flavourless},
\end{cases} \qquad (1) $$

where $\Delta y_{ij} = y_i - y_j$, $\Delta\phi_{ij} = \phi_i - \phi_j$ and $k_{ti}$, $y_i$ and $\phi_i$ are respectively the transverse
momentum, rapidity and azimuth of particle i, with respect to the beam.

2. For each particle define a distance with respect to the beam B at positive rapidity,

$$ d_{iB}^{(F,\alpha)} =
\begin{cases}
\max(k_{ti}, k_{tB}(y_i))^{\alpha} \min(k_{ti}, k_{tB}(y_i))^{2-\alpha}, & i \text{ is flavoured}, \\
\min(k_{ti}^2, k_{tB}^2(y_i)), & i \text{ is flavourless},
\end{cases} \qquad (2) $$

with

$$ k_{tB}(y) = \sum_i k_{ti} \left( \Theta(y_i - y) + \Theta(y - y_i)\, e^{y_i - y} \right). \qquad (3) $$

Similarly define a distance to the beam $\bar B$ at negative rapidity by replacing $k_{tB}$ in
eq. (2) with $k_{t\bar B}$

$$ k_{t\bar B}(y) = \sum_i k_{ti} \left( \Theta(y - y_i) + \Theta(y_i - y)\, e^{y - y_i} \right). \qquad (4) $$

3. Identify the smallest of the distance measures. If it is a $d_{ij}^{(F,\alpha)}$, recombine i and j
into a new particle, summing their flavours and 4-momenta; if it is a $d_{iB}^{(F,\alpha)}$ (or $d_{i\bar B}^{(F,\alpha)}$)
declare i to be a jet and remove it from the list of particles.

4. Repeat the procedure until no particles are left.



Sensible values for $\alpha$ are 1 or 2 [15], and R should be kept of order 1, to avoid the
appearance of large logarithms of R.
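
The clustering steps above can be sketched in a few lines of Python. This is a minimal, unoptimised illustration of eqs. (1)-(4) and not the implementation used for the results of this paper; the function names, the E-scheme recombination as a plain four-vector sum, and the treatment of the step function at equal rapidities (the particle's own term enters $k_{tB}$ with weight one) are assumptions made for the example.

```python
import math
from itertools import combinations


def massless(kt, y, phi):
    """Four-momentum (px, py, pz, E) of a massless particle."""
    return (kt * math.cos(phi), kt * math.sin(phi), kt * math.sinh(y), kt * math.cosh(y))


def kinematics(p):
    """Return (kt, rapidity, azimuth) of a four-momentum."""
    px, py, pz, e = p
    return math.hypot(px, py), 0.5 * math.log((e + pz) / (e - pz)), math.atan2(py, px)


def momentum_factor(kt_a, kt_b, softer_is_flavoured, alpha):
    """kt-dependent factor of eqs. (1) and (2)."""
    hi, lo = max(kt_a, kt_b), min(kt_a, kt_b)
    return hi**alpha * lo**(2 - alpha) if softer_is_flavoured else lo**2


def flavour_kt(particles, R=0.7, alpha=1.0):
    """Inclusive flavour-kt clustering.

    particles: list of (four_momentum, net_heavy_flavour) pairs.
    Returns the list of jets in the same format.
    """
    objs = [(tuple(p), f) for p, f in particles]
    jets = []
    while objs:
        kin = [kinematics(p) for p, _ in objs]
        best = None  # (distance, i, j), with j = None for a beam distance
        for i, (kt, y, _) in enumerate(kin):
            # Beam scales of eqs. (3) and (4), evaluated at y = y_i.
            kt_b = sum(k * (1.0 if yj >= y else math.exp(yj - y)) for k, yj, _ in kin)
            kt_bbar = sum(k * (1.0 if yj <= y else math.exp(y - yj)) for k, yj, _ in kin)
            flavoured = objs[i][1] != 0
            d_beam = min(momentum_factor(kt, kt_b, flavoured, alpha),
                         momentum_factor(kt, kt_bbar, flavoured, alpha))
            if best is None or d_beam < best[0]:
                best = (d_beam, i, None)
        for i, j in combinations(range(len(objs)), 2):
            (kti, yi, phii), (ktj, yj, phij) = kin[i], kin[j]
            dphi = abs(phii - phij)
            dphi = min(dphi, 2.0 * math.pi - dphi)
            softer = i if kti < ktj else j
            d_ij = ((yi - yj)**2 + dphi**2) / R**2 * momentum_factor(
                kti, ktj, objs[softer][1] != 0, alpha)
            if d_ij < best[0]:
                best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            jets.append(objs.pop(i))  # beam distance is smallest: i becomes a jet
        else:
            (p_i, f_i), (p_j, f_j) = objs[i], objs[j]
            merged = (tuple(a + b for a, b in zip(p_i, p_j)), f_i + f_j)
            objs = [o for n, o in enumerate(objs) if n not in (i, j)] + [merged]
    return jets
```

As a usage example, a nearby $b\bar b$ pair recoiling against a gluon, e.g. `flavour_kt([(massless(50.0, 0.0, 0.0), +1), (massless(40.0, 0.1, 0.1), -1), (massless(60.0, -0.2, math.pi), 0)])`, is clustered into two jets that both carry zero net flavour, which is how a gluon-splitting configuration ends up counted as a light jet.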

The IR-safety of this algorithm was proved in [15]. A general consequence of IR-safety
is that it allows one to take the limit $m_Q \to 0$ (any finite-mass corrections being
suppressed by powers of $m_Q^2/p_t^2$) as long as collinear singularities associated with incoming
heavy quarks are factorised into a heavy-quark PDF. This means that we can compute
heavy-quark jet cross sections using a simpler, light-flavour NLO program, rather than a
heavy-flavour one [18]. Furthermore IR and collinear safety ensure that one obtains the
same results whether one considers heavy-quark flavour at parton level, or heavy-meson
flavour at hadron level, modulo corrections suppressed by powers of $\Lambda_{\rm QCD}/p_t$.

Figure 3: Inclusive jet spectrum at the Tevatron (left) and at the LHC (right). The top
two panels show results for both b-jets and all-jets, while the lower three panels apply only
to b-jets. See text for further details.

3 Results 



In Fig. [3] we present the inclusive b-jet $p_t$-spectrum as obtained with the flavour algorithm
specified above. We have used the jet-algorithm parameters $\alpha$ = 1 and R = 0.7, the latter
having been shown to limit corrections associated with the non-perturbative underlying
event [5]. The left (right) column of the figure shows results for the Tevatron Run II (LHC).
We have selected only those jets with rapidity |y| < 0.7. We also show the full inclusive
jet spectrum (all jets) as obtained with a standard inclusive $k_t$-algorithm with R = 0.7.

The spectra have been calculated using NLOJET [19]. The publicly available version
sums over the flavour of outgoing partons. We therefore had to extend it so as to have access
to the flavour of both incoming and outgoing partons. We fixed the default renormalisation
and factorisation scales to be $P_t$, the transverse momentum of the hardest jet in the
event, and chose as a default PDF set CTEQ61m [20]. We also used the a posteriori PDF
library (APPL) of [21], together with the HOPPET [22] and LHAPDF [23] packages to
allow us to vary scales and PDF sets after the NLOJET Monte Carlo integration.

The figure shows the inclusive jet spectrum at LO (blue, dashed) and at NLO (red,
solid) for all jets and for b-jets. The b-jet cross section is always a few percent of the
light jet one. The K-factor, the ratio of the NLO to the LO cross section, is shown below and is
similar (between 1.15 and 1.4) for light and b-jets, both at the Tevatron and at the LHC.

Figure 4: Inclusive jet spectrum at the Tevatron (left) and at the LHC (right) for generic
jet production and for c-jet production. See text for further details.



To provide an estimate for the theoretical uncertainty we vary separately the factorisation
and the renormalisation scale in the range $P_t/2 < \mu_R, \mu_F < 2P_t$. The band associated
with this variation is shown in the plots below. We see that this is at most a 15% effect
in the region considered. We note that our procedure is more conservative than the usual
simultaneous variation of $\mu_R$ and $\mu_F$ (as done in figures [1] and [2]).
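
The difference between separate and simultaneous scale variation can be illustrated with a toy example. In the sketch below the cross-section function and its logarithmic coefficients are invented purely for illustration (they do not come from NLOJET or from any result in this paper), and the continuous range of scales is approximated by the three values 1/2, 1 and 2 for each of $\mu_R/P_t$ and $\mu_F/P_t$.

```python
import itertools
import math


def toy_cross_section(x_r, x_f):
    """Hypothetical stand-in for dsigma/dp_t at mu_R = x_r * P_t, mu_F = x_f * P_t.

    The logarithmic coefficients are invented for illustration only.
    """
    return 1.0 - 0.10 * math.log(x_r) + 0.08 * math.log(x_f)


def band(points, sigma=toy_cross_section):
    """Return (min, max) of sigma over the given (x_r, x_f) points."""
    values = [sigma(x_r, x_f) for x_r, x_f in points]
    return min(values), max(values)


FACTORS = (0.5, 1.0, 2.0)
simultaneous = [(x, x) for x in FACTORS]                 # mu_R = mu_F varied together
independent = list(itertools.product(FACTORS, FACTORS))  # mu_R and mu_F varied separately

if __name__ == "__main__":
    print("simultaneous band:", band(simultaneous))
    print("independent band: ", band(independent))
```

Because the simultaneous points are a subset of the independent grid, the independent band can never be narrower. In this toy case it is much wider, since the renormalisation- and factorisation-scale dependences partly cancel along the diagonal.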

We have also calculated (but do not show) the b-jet spectrum for our definition of
heavy jets using a massive NLO calculation with MCFM [10]. We find that the results
are consistent with those from the massless calculation, though the uncertainties in the
massive calculation are much larger, only slightly smaller than those in fig. [2].

Though the massive calculation is not itself of much direct interest given its significant
uncertainties, it does enable one to estimate residual finite-mass effects, via the relation

$$ \frac{d\sigma_b}{dp_t} = \frac{d\sigma_b^{\rm nlojet}}{dp_t} + \lim_{m_0 \to 0} \left[ \left. \frac{d\sigma_b^{\rm MCFM}}{dp_t} \right|_{m = m_b} - \left. \frac{d\sigma_b^{\rm MCFM}}{dp_t} \right|_{m = m_0} + C(p_t) \ln\frac{m_b}{m_0} \right]. \qquad (5) $$

Here, the contents of the bracket correspond to the evaluation of the difference between
the result for the true mass and the massless limit, while subtracting logarithms such
that the massless MCFM calculation is effectively being carried out with a coupling and
b-PDF that are mass-independent at scale $p_t$. Further details and the form for $C(p_t)$ are
provided in appendix [A]. The relative size of the residual finite-mass effects is shown
in the penultimate panel of fig. [3]. They decay somewhat more slowly with $p_t$ than the
naive expectation of $1/p_t^2$ (a feature noted before in [23]), perhaps because they have
logarithmic enhancements. Nevertheless they are always below 6% and given their modest
size compared to the massless perturbative uncertainties, we choose not to explicitly add
them to the main NLOJET results.

To illustrate the dependence on the parton densities we show in the bottom panel of
fig. [3] the effect of using all members of the CTEQ61 [20] and MRST2001E [25] PDF sets,
relative to the default CTEQ61m choice.⁶ We see that the effect is always moderate at the
Tevatron (< 20%), while it is large at the LHC in the high-$p_t$ region, presumably because
the b and gluon PDFs are not well constrained in that region.

We have also calculated the spectrum for charm jets and the results are shown in
figure [4]. We omit the panel showing finite-mass effects because of the low charm quark
mass. The most notable difference relates to the PDF dependence. There has been some
discussion of a possible intrinsic charm (IC) component of the proton and a recent analysis
provides PDF sets, CTEQ65c, with various models for such a component, see [28] and
references therein. One sees that at moderate $p_t$ these sets suggest that there is up to 40%
uncertainty in the charm jet spectrum and at higher $p_t$ the uncertainty reaches a factor of
two. Further investigation reveals that the moderate-$p_t$ uncertainty is related to a possible
sea-like IC component. In the sea-like scenario considered in [28] it was assumed that
charm and anti-charm are distributed as the up and down sea components in the proton.
At higher $p_t$ the uncertainty is due to the valence-type models for IC considered in [28],
specifically the original BHPS light-cone model [29], and a meson-cloud picture [30] in
which the IC arises from virtual low-mass meson+baryon components of the proton.

Let us now return to the question of theoretical uncertainties in our predictions,
specifically the scale dependence. Fig. [5] shows the ratio

$$ r(x_\mu, p_t) = \frac{d\sigma/dp_t\,(\mu_R = \mu_F = x_\mu P_t)}{d\sigma/dp_t\,(\mu_R = \mu_F = P_t)}, \qquad (6) $$

for the inclusive and heavy-quark jet cross sections in various $p_t$-bins. The factorisation
and renormalisation scales are varied simultaneously, $\mu_R = \mu_F = \mu = x_\mu P_t$. At low $p_t$ at
the Tevatron and at intermediate $p_t$ at the LHC the scale dependences are quite different
at low values of $x_\mu$ (< 0.5) due to the dominance of different partonic channels. However,
the sensitivity, i.e. the dependence of $r(x_\mu, p_t)$ on $x_\mu$, remains always of the same order for
heavy-quark jets and all jets. The charm ratio is generally intermediate between the b and
all-jet ratios, as is natural given the relative masses of the charm and bottom quarks.

The fact that the scale dependences are similar for all and heavy jets in many of the
$p_t$ bins suggests that if one considers the ratio of heavy to all jets a significant part of
the theory uncertainties may cancel. Additionally, a number of experimental uncertainties
may cancel, for example part of the jet energy scale and luminosity dependence.



6 We have also examined the CTEQ65 [26], MRST2004nlo and MRST2004nnlo [27] sets and found
similar results.

Figure 5: Ratio of the spectrum at factorisation and renormalisation scales $\mu_R = \mu_F = x_\mu P_t$
to that at $\mu_R = \mu_F = P_t$.

Accordingly in fig. [6] we show the ratio of b- and c-jet spectra to the all-jet spectra.
The ratio is always of the order of a few percent and is somewhat larger for c-jets than
for b-jets, as is to be expected given the larger charm PDF. At higher $p_t$ it increases at
the Tevatron and decreases at the LHC due to the different behaviour of the PDFs in the
range of x and $Q^2$ probed by the two different machines. In particular, at large x all-jet
spectra are dominated by channels with valence incoming quarks. The same is true at the
Tevatron for heavy-quark jets, where the main high-$p_t$ production channel is $q\bar q \to Q\bar Q$.
At the LHC, on the contrary, high-$p_t$ heavy quarks are produced mainly via $Qq \to Qq$
processes, so that heavy-jet spectra are suppressed by the heavy-quark PDF.

The lower panels of fig. [6] show the uncertainty associated with the variation of
factorisation and renormalisation scales and the PDFs. The scale dependence is reduced in the
whole $p_t$ range compared to that for the heavy-jet spectra. This is especially the case at
large $p_t$ (cf. fig. [5]). The PDF dependence is also reduced except in the case of charm jets
using PDFs with an intrinsic charm component, CTEQ65c.

Figure 6: Top: ratio of b-jet to inclusive jet spectra at the Tevatron and at the LHC.
Bottom: ratio of the c-jet to inclusive jet spectra. Further details are provided in the text.

4 Experimental issues



The main outstanding question is that of the experimental measurability of heavy-flavour
jets as defined here. We examine this specifically for b-jets, since they have been much
more widely studied. We will comment briefly on c-jets at the end of the section.

The question of the experimental measurability of our b-jet definition can only truly be
settled by a detailed experimental study. However several points lead us to believe that
such a measurement might well be possible. Our discussion here is inspired in part by that
in [31], which measured $B\bar B$ azimuthal correlations at the Tevatron, including the region
of small angular separation between the $B$ and $\bar B$, which is the experimentally non-trivial
region also for our definition of b-jets. One should be aware in the discussion below that
the correspondence between our needs and what was done in [31] is only partial, insofar
as the measured B-hadrons were not used as inputs to a jet algorithm, and also had lower
typical transverse momenta than would the B-hadrons in b-jet studies.⁷

In an ideal world the input to the jet flavour algorithm would be a list of momenta of
all particles in the event together with information about which particles correspond to a
B-hadron. We are allowed to use B-hadrons rather than b quarks in the algorithm because
the flavoured jets are infrared and collinear safe and the fragmentation of a b quark into a
B-hadron should have no more effect than collinear radiation from the b quark.

Experimentally one has information on charged tracks and their momenta, calorimeter
energy deposits, and b-tags. The latter typically exploit the long lifetime of B-hadrons,
which causes the B-hadron to decay some small but measurable distance away from the
primary interaction vertex of the event. If the B-hadron decay products include two or
more charged particles then a secondary vertex may be identified from the intersection of
the resulting charged tracks, whereas if the decay involves only one charged track then one
may still obtain a b-tag based on the finite impact parameter between that track and the
primary vertex. Often the b-tagging is restricted to tracks within hard jets, so as to reduce
certain backgrounds.

Current b-tagging abilities don't correspond to our 'ideal world' scenario for a variety of
reasons. Firstly, since one often sees only a subset of the B-hadron decay products, one does
not know the B-hadron momentum. This should not matter since the jet algorithm will
in its first steps recombine the observed charged tracks in the decay with the calorimeter
energy deposits from the neutral particles (other than neutrinos) in the decay.

A second problem is that whereas experiments first search for jets and then do the
b-tagging, we need the information on b-tags before running the jet algorithm. This should
not be a major obstacle: one may first identify jets using a standard $k_t$ or cone algorithm,
with large radius parameter (so as to catch most b's, as done in [31]), carry out the
b-tagging, and then run the flavour algorithm using that information.

7 At higher $p_t$'s the fraction of $b\bar b$ pairs at small angles will be increased, making the analysis more
difficult; on the other hand the secondary vertices will be displaced further from the primary vertex and
this should facilitate the analysis. The extent to which these two effects cancel can only be determined by
a full experimental study.

The third and potentially most serious issue relates to the finite efficiency for b-tagging,
and notably for double b-tagging inside a single jet. The efficiency for b-tagging is limited
for various reasons: partly because of the need to place cuts on impact parameter to
avoid backgrounds from decays of charm hadrons, which also decay a small but measurable
distance from their production vertex (such backgrounds are partially reduced also by using
the invariant mass of the decay products); and partly because of issues related to detector
limitations. Double b-tagging for a pair of B-hadrons that are close in rapidity and azimuth
(i.e. in the same jet) is considered particularly difficult, because of the need to be sure that,
if one sees two secondary vertices in a jet, they aren't 'sequential tags' from a single b, i.e.
the vertex from a B-hadron decaying to a D-hadron plus other particles, followed by the
vertex from the D decay. Double b-tagging inside a single jet is nevertheless possible, albeit
currently with limited efficiency, as has been shown in [31]. This is important because our
algorithm relies on jets with two b's inside being identified as light jets.

To evaluate the impact of finite efficiencies, we consider the following simple model. We
suppose the efficiency for tagging a single B-hadron to be $x$, and the efficiency for tagging
two B-hadrons that are well-separated (i.e. in separate jets) to be $x^2$. Typical values for
$x$ are ~ 0.5. In contrast the probability of tagging two nearby B-hadrons is taken to be
$yx^2$ (while the probability for tagging neither is $(1 - x)^2$), with $y$ ~ 0.2 [31] a measure
of the extra difficulty of tagging two nearby B-hadrons. If, in a given bin, the number of
true b-jets is $T$ and the number of jets containing $b\bar b$ due to gluon splitting is $G$, then the
measured number of single-tagged b-jets will be⁸

$$ t = xT + x(2 - (1 + y)x)G. \qquad (7) $$

The contamination due to single-tagged gluon splitting is found by taking one minus the
fraction of gluon-splitting jets where neither $b$ has been tagged, or where both $b$'s have
been tagged, $x(2 - (1 + y)x)G = (1 - (1 - x)^2 - x^2 y)G$. The measured number of light,
'gluon-splitting', jets with double b-tags will be

$$ g = x^2 y G. \qquad (8) $$

8 We ignore the potential effect of a flavour-mistag on the kinematics of the jets. This should be justified
since the differences between a flavour $k_t$ and a normal $k_t$ algorithm are at the level of a few percent in
the spectra, and in the absence of flavour information the flavour $k_t$ algorithm just behaves like a normal
$k_t$ algorithm.

It is straightforward to deduce $T$ from measurements of $t$ and $g$,

$$ T = \frac{t}{x} - \frac{2 - (1 + y)x}{x^2 y}\, g, \qquad (9) $$

as long as one knows the efficiencies $x$ and $y$. In practice those efficiencies will be imperfectly
known, with uncertainties $\delta x$ and $\delta y$, and the effect of the estimated efficiency, used in
eq. (9), being different from the true efficiency, eqs. (7), (8), will be an error $\delta T$ on the
determination of $T$,

$$ \delta T^2 = \left[ (2G - T) \frac{\delta x}{x} \right]^2 + \left[ G(2 - x) \frac{\delta y}{y} \right]^2, \qquad (10) $$

where we assume the uncertainties on $x$ and $y$ to be uncorrelated. Since $G$ and $T$ are of
the same order of magnitude (cf. fig. [2]), the uncertainty on $T$ is essentially given by the
relative uncertainties on $x$ and $y$. If these can both be controlled to within 10% then
for $G \simeq 0.75\,T$, as we have at the Tevatron for $p_t$ ~ 100 GeV, the relative uncertainty
on $T$ should be roughly 12% (for $x$ ~ 0.5). For an integrated luminosity of 2 fb$^{-1}$ there
are ~ $10^5$ events in a bin of width 10 GeV centred at $p_t$ = 100 GeV, so statistical errors
will be considerably smaller than this, and they are dominated by the relative error on
$g = x^2 y G$. Only at higher energies, when $g$ starts to be small, will the enhancement of
relative statistical errors due to the limited tagging efficiencies start to matter.
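
The efficiency model of eqs. (7)-(10) is simple enough to check numerically. The sketch below is only an illustration: the function names are ours, and the absolute count T = 1000 is an arbitrary assumption, since only the ratio G/T = 0.75 and the efficiencies x = 0.5, y = 0.2 with 10% relative uncertainties are taken from the discussion above.

```python
import math


def observed_counts(T, G, x, y):
    """Single-tag count t, eq. (7), and double-tag count g, eq. (8)."""
    t = x * T + x * (2.0 - (1.0 + y) * x) * G
    g = x**2 * y * G
    return t, g


def unfold_true_bjets(t, g, x, y):
    """Invert eqs. (7) and (8) for the number of true b-jets, eq. (9)."""
    return t / x - (2.0 - (1.0 + y) * x) / (x**2 * y) * g


def relative_uncertainty(T, G, x, rel_dx, rel_dy):
    """delta T / T from eq. (10), for uncorrelated relative errors on x and y."""
    term_x = (2.0 * G - T) * rel_dx
    term_y = G * (2.0 - x) * rel_dy
    return math.hypot(term_x, term_y) / T


if __name__ == "__main__":
    T, G, x, y = 1000.0, 750.0, 0.5, 0.2
    t, g = observed_counts(T, G, x, y)
    print("t =", t, " g =", g)
    print("unfolded T =", unfold_true_bjets(t, g, x, y))
    print("delta T / T =", relative_uncertainty(T, G, x, 0.10, 0.10))
```

With these inputs the unfolding returns the input T, and the relative uncertainty comes out at about 12%, dominated by the term proportional to $\delta y/y$.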

The above discussion is of course somewhat simplistic. In reality, single and double
b-tagging efficiencies may vary with rapidity, azimuthal separation and transverse momentum,
though this ought to be possible to account for; one should also correct for impurities
in the b-tag samples (based on the uncertainties for the azimuthal correlations given in
[31], this may be roughly equivalent to doubling the uncertainty on $y$); and a number of
other experimental uncertainties will also contribute, such as energy scale uncertainties.
On the other hand, steady progress is being made in b-tagging techniques [32, 33, 34]. One
also wonders whether the knowledge that a second b is present somewhere in the event can
be used in conjunction with a loose second b-tag, so as to obtain information about where
the second b is most likely to be (in the same jet, in another jet, or down the beam-pipe),
giving an effectively larger value for $y$ (possibly even > 1). This might be important
particularly when statistics are limited, e.g. at high $p_t$, and also potentially when using flavour
information in new-particle searches.

Finally, as concerns c-jets, though they have been the subject of far fewer investigations,
we do note that double-tag samples also exist for charmed hadrons [35] and that some of
the studies on b-tagging [32] also provide information on charm flavour, suggesting that
c-jet studies may also be possible. As for b-jets, a critical issue in a good measurement of
the charm jet spectrum will be not so much that of obtaining high tagging efficiencies, but
rather of a good understanding of those efficiencies even if they are low.

5 Conclusions 

The key finding of this article is that if one uses a properly defined jet-flavour algorithm 
and exploits its infrared safety to take the massless limit, predictions for heavy-quark jet 
spectra can be made substantially more accurate than those based on current definitions 
and NLO massive calculations (e.g. MNR [9], MCFM [10] or MC@NLO [12]). When
quantified in terms of scale dependence, the QCD theoretical uncertainty is reduced from
30-50% to 10-20%. This is because large higher-order logarithms that first appear
at NLO in the massive calculation are either cancelled by the jet definition itself, or else 

9 The most delicate is y, and from table III of [31], which contains a breakdown of sources of systematic
error (including that on the relative efficiency for tagging two nearby b's compared to two well separated
b's), it seems that 10% is a reasonable value for the uncertainty on y.
absorbed into the heavy-quark PDF in such a way as to become part of the leading order
contribution, so that the NLO term is truly a perturbative correction.

Measurements of the heavy-flavour jet spectra as presented here would be of interest
for a range of reasons. Heavy-flavour jet spectra measured so far do not distinguish
between 'true' heavy-flavour jets and gluon jets that fragment to $Q\bar Q$. Our definition
instead provides just the true flavoured-jet component. Thus for the first time not only is
the momentum of a hard parton a meaningful observable quantity (as defined by the jet
algorithm), but so is its flavour.

More generally, heavy-flavour jets, in particular b-jets, are used in a variety of contexts,
including PDF measurements, top quark studies, and searches for new particles. These can
only benefit from a properly defined jet flavour. One example seen in section 3 is for the
charm PDF: current measurements leave considerable room for a non-perturbative 'intrinsic'
charm component in the proton, and given an experimental accuracy that matched the
theoretical accuracy of our charm-jet predictions, significant constraints could be placed
on this intrinsic component. Similarly, a measurement of W+c-jet production could help
constrain the strange quark PDF [36].

To calculate the heavy-flavour jet spectra shown here, we used NLOJET. By default it 
sums over the flavours of outgoing partons, so we modified it so as to be able to disentangle 
the flavour information. Though not completely trivial, this was quite a bit simpler than 
writing a new NLO Monte Carlo program for a massless process, and very much simpler 
than writing the corresponding heavy-flavour Monte Carlo program. One could analogously 
extract the flavour information from the many other NLO Monte Carlo programs involving 
massless QCD particles, thus providing heavy-flavour jet predictions in a range of processes. 
The usefulness of the flavour information is such that we strongly encourage NLO (and
NNLO) Monte Carlo authors to provide it by default.$^{10}$

To supplement the massless calculation, we also investigated residual effects associated
with the finite value of the b-quark mass. For jets with $p_t > 50$ GeV they were of the order of
5%, falling off rapidly at higher $p_t$. This was the most laborious part of our study; however,
given the small size of the effects we believe that it should be safe to neglect them in future
NLO calculations of heavy-flavour jets for other processes. Only when considering NNLO
heavy-flavour jet predictions, or low values of $p_t$ at NLO, should it become mandatory to
include finite mass effects.

The main open question remains that of the experimental usability of our jet-flavour
algorithm, mainly because of its reliance on the correct identification of situations where
a jet contains both a $B$ ($D$) and a $\bar B$ ($\bar D$) hadron. As discussed in sec. 4, given reasonable
relative uncertainties on single and double-tag efficiencies, we believe that it ought to be
possible to make an experimental measurement with errors that are not disproportionate

10 That usefulness extends beyond the framework of the jet-flavour type algorithm used here. For example,
to improve the prediction for the current experimental definition of b-jets, one could use the prediction
given here as a starting point and supplement it with an NLO ($\alpha_s^3 + \alpha_s^4$) calculation of the difference
between the experimental definition and ours, which starts only at $\mathcal{O}(\alpha_s^3)$. In principle, given the recent
NLO calculation of the $Q\bar Q$+jet cross section [37], the technology already exists for such a combination.
compared to theory uncertainties. For the case of B hadrons, ongoing improvements in
flavour tagging techniques, together with the use of 'loose' tagging to identify the second B
hadron in an event where a first B hadron has already been found, might help further. We
look forward therefore to future experimental investigations of heavy-flavour jet spectra
with the definition presented here.
A Finite mass effects

Given the small theoretical errors in the predictions for heavy-quark spectra when an
infrared safe algorithm and massless calculation are used, it is important to make sure
that the error due to the massless quark approximation remains smaller than the quoted
theoretical errors even at moderate values of $p_t$. In this appendix we explain how $\mathcal{O}(\alpha_s^2)$ and
$\mathcal{O}(\alpha_s^3)$ finite-mass effects can be extracted from the massive NLO calculation in MCFM.

The procedure consists in subtracting from the full, massive NLO result the collinear
logarithms which with a massless calculation are resummed into heavy-quark PDFs, any
remainder being due to finite mass effects $\mathcal{O}(m_Q^2/p_t^2)$, potentially enhanced by logarithms.
The heavy-quark production mechanisms that can give rise to collinear logarithms are
flavour excitation and gluon splitting. However, if an infrared safe algorithm is used the
only logarithmic enhancements that survive are those associated with flavour excitation.

We denote generally by $\sigma(m_Q)$ any heavy-quark jet cross section corresponding to a set
of kinematic cuts and study its dependence on the heavy-quark mass $m_Q$ by considering
$\Delta\sigma(m_Q, m_0) = \sigma(m_Q) - \sigma(m_0)$, where $m_0$ is an arbitrary reference mass.

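When three partons are produced in the final state (NLO real contribution) the
logarithmically enhanced contribution $\Delta\sigma(m_Q, m_0)$ due to flavour excitation (FEX) is given by

$$
\begin{aligned}
\Delta\sigma(m_Q, m_0) \simeq \frac{\alpha_s}{2\pi} \ln\frac{m_0^2}{m_Q^2} \times \int dx_1\, dx_2\, \Big[
 & (P_{Qg} \otimes g)(x_1)\, g(x_2)\, \hat\sigma^{(0)}_{Qg \to Qg}(x_1, x_2) + g(x_1)\, (P_{Qg} \otimes g)(x_2)\, \hat\sigma^{(0)}_{gQ \to Qg}(x_1, x_2) \; + \\
 & (P_{Qg} \otimes g)(x_1)\, q(x_2)\, \hat\sigma^{(0)}_{Qq \to Qq}(x_1, x_2) + q(x_1)\, (P_{Qg} \otimes g)(x_2)\, \hat\sigma^{(0)}_{qQ \to Qq}(x_1, x_2) \Big],
\end{aligned}
\tag{11}
$$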
Figure 7: Various contributions to the inclusive cross section for b jets with $p_t > 50$ GeV
and $|y| < 0.7$ at the Tevatron, as a function of the heavy-quark mass $m_Q$. The points are
from a massive calculation using MCFM, while at NLO the lines are given by the slopes
in eqs. (11, 12), with a constant term adjusted so as to match the massive calculation at
$m_Q = 0.5$ GeV.



where the contributions in the first line are due to diagrams where the hard scattering
process is $Qg \to Qg$, while terms in the second line correspond to diagrams where the hard
scattering process is $Qq \to Qq$, and $\hat\sigma^{(0)}_{ab \to cd}(x_1, x_2)$ denotes the Born partonic cross section
for the process $ab \to cd$ as a function of the incoming energy fractions $x_1, x_2$. The sums
over light-quark flavours (and over quarks and antiquarks) are implicit.

In the case of NLO virtual corrections, for calculations in which the heavy-quark flavour
is decoupled both in the running coupling and the PDFs, the only logarithmically enhanced
contribution comes from the subprocess $q\bar q \to Q\bar Q$:

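$$\Delta\sigma(m_Q, m_0) \simeq \frac{2\alpha_s T_R}{3\pi} \ln\frac{m_0^2}{m_Q^2}\; \sigma^{(0)}_{q\bar q \to Q\bar Q}. \tag{12}$$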

In the $gg \to Q\bar Q$ subprocess, logarithmically enhanced virtual corrections from the
renormalisation group evolution of the coupling and the gluon distribution cancel.

As an example, in fig. 7 we plot $\sigma(m_Q)$, the integrated inclusive $p_t$ spectrum at the
Tevatron for $p_t > 50$ GeV and $|y| < 0.7$, as a function of $m_Q$ for the real and virtual NLO
contributions, $\mathcal{O}(\alpha_s^3)$, as given by MCFM. We also show the LO result for reference.

We see that in the small mass region, the cross sections computed with MCFM are well
approximated by $\sigma(m_0) + \Delta\sigma(m_Q, m_0)$, where $\Delta\sigma(m_Q, m_0)$ are the finite-mass logarithmic
contributions in eq. (11) and eq. (12), where the integration has been performed numerically
with CAESAR. We obtain similar results at the LHC. Finite-mass effects for the
inclusive $p_t$ spectra can then be computed by considering the difference
$p_t\, d\sigma(m_Q)/dp_t - p_t\, d\sigma(m_0)/dp_t$, where $m_0$ is as close to zero as numerically possible given
the presence of small-mass instabilities in the NLO calculation (we choose $m_0 = 0.2$ GeV at the
Tevatron and $m_0 = 1.0$ GeV at the LHC), and subtracting all collinear enhancements predicted from
eqs. (11) and (12). The result of this procedure is what is presented for b-jets in the 4th
panel of fig. 3. In this manner, we obtained the coefficient $C(p_t)$ used in the main text:

$$C(p_t)\, \ln\frac{m_b}{m_0} = \frac{d}{dp_t}\, \Delta\sigma(m_b, m_0). \tag{13}$$
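
The subtraction described in this appendix can be sketched in a few lines of Python. This is only a toy illustration: it assumes that the logarithmically enhanced shift in a given bin takes the form slope $\times \ln(m_0^2/m_Q^2)$, where `slope` stands in for the prefactor and convolution integrals of the collinear terms. The function names, the toy cross section and all numerical values are hypothetical and are not results of the calculation.

```python
import math


def log_enhanced_shift(slope, m_q, m_0):
    """Collinear-logarithm part of sigma(m_q) - sigma(m_0) (assumed form)."""
    return slope * math.log(m_0 ** 2 / m_q ** 2)


def finite_mass_remainder(sigma_mq, sigma_m0, slope, m_q, m_0):
    """Finite-mass effect left after removing the collinear logarithms."""
    return (sigma_mq - sigma_m0) - log_enhanced_shift(slope, m_q, m_0)


def toy_sigma(m, const=10.0, slope=0.3, power=0.02):
    """Hypothetical cross section: constant + collinear log + m^2 correction."""
    return const + slope * math.log(1.0 / m ** 2) - power * m ** 2


# Hypothetical masses in GeV: a physical-scale mass and a small reference mass.
m_b, m_0 = 4.75, 0.2
remainder = finite_mass_remainder(toy_sigma(m_b), toy_sigma(m_0), 0.3, m_b, m_0)
print(f"finite-mass remainder in the toy model: {remainder:.5f}")
```

In the toy model the remainder reduces to the pure power correction, which is the quantity the procedure is designed to isolate.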



B Electroweak corrections 



There has been discussion in the recent literature [39, 40, 41, 42] of potentially large
electroweak (EW) corrections to QCD light and heavy (top) dijet cross sections. Generally
speaking there is consensus that these effects should be modest (< 5%) at the Tevatron,
but it is not uncommon for effects of up to 40% to be quoted at the upper end (4 TeV) of
the $p_t$ reach of the LHC.

Two kinds of issues need to be addressed. Firstly there are effects that apply equally to
inclusive and flavoured jet cross sections: it has been known for some time now [43, 44, 45]
that electroweak loop corrections for high-$p_t$ processes involve enhancements proportional
to $\alpha_{EW}^n \ln^{2n}(p_t/M_W)$. Such terms are analogous to Sudakov double logarithms in QCD,
with the difference that the masses at the electroweak scale regulate the infrared and
collinear divergences. Because of their double logarithmic structure they become large at
high $p_t$, and they are the main culprits in the 40% effects quoted in [39] at the high end of the
LHC reach (4 TeV). A point emphasised there is that a phenomenological understanding
of the impact of EW effects also requires that one consider the experimental treatment
of real EW radiation. Ref. [40] examined isolated W and Z radiation and found that it
compensated for about a quarter of the loop effects. However, the dominant real radiation
contribution should come from (soft) collinear W and Z emission, and it is to be expected
that this will compensate a significant remaining part of the loop effects.
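
To give a feel for how quickly the double logarithm grows, the short Python sketch below evaluates $\ln^{2n}(p_t/M_W)$ at a few transverse momenta. It assumes $M_W \approx 80.4$ GeV and deliberately leaves out the coupling and all process-dependent coefficients, so the numbers indicate only the growth of the logarithm and not the size of the actual correction. The function name and the chosen $p_t$ values are illustrative.

```python
import math

M_W_GEV = 80.4  # approximate W boson mass in GeV (assumed input)


def ew_double_log(pt_gev, n=1, m_w=M_W_GEV):
    """Return ln^{2n}(p_t / M_W), the logarithmic factor at order n."""
    return math.log(pt_gev / m_w) ** (2 * n)


# Illustrative transverse momenta in GeV, up to the 4 TeV quoted for the LHC reach.
pt_values = [100.0, 400.0, 1000.0, 4000.0]
double_logs = [ew_double_log(pt) for pt in pt_values]
for pt, value in zip(pt_values, double_logs):
    print(f"p_t = {pt:6.0f} GeV  ->  ln^2(p_t/M_W) = {value:.2f}")
```

The logarithmic factor rises by more than two orders of magnitude between 100 GeV and 4 TeV, which is why these terms matter only at the upper end of the LHC reach.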

A second issue arises specifically when considering flavoured jets, because by isolating
a given flavour one breaks the electroweak SU(2) symmetry: while the emission of a soft
W boson has little effect on the energy of the jet and so should largely cancel with
corresponding virtual corrections in the inclusive jet spectrum, if the W is emitted from a b-jet,
it will convert it into a top-quark jet [46]. This is often referred to as Bloch-Nordsieck
violation [43] and may lead to significant double logarithmic EW corrections for the very
highest $p_t$ flavoured jets at the LHC. As for inclusive jet analyses, the details of the
experimental treatment are likely to be crucial, since the flavour attributed to the jet will
depend on whether the top quark is reconstructed or whether it is only the b-hadron from
the $t \to b + W$ decay that is identified. For charm jets the experimental situation will
be different insofar as the real EW emission process is $c \to s + W$, and strange hadrons
do not decay back to charmed hadrons. The question of flavour-changing EW effects is
relevant also for the gluon splitting process, e.g. $g \to c\bar c$, where one of the charm quarks
may then emit a W, giving a jet with a net charm flavour. If the W is not identifiable
experimentally, then at high $p_t$ at LHC this process, which has enhancements of the
form $\alpha_s^n \alpha_{EW}^m \ln^{m+n}(p_t/M_W)$, may give a non-negligible contribution to the charm-jet
spectrum.

For both b and c jets, if the experiments prove to be able to measure heavy flavour at 
these high p t values, then it will become important to examine the above issues in more 
detail. 